Hierarchical Heuristic Forward Search in Stochastic Domains

Authors

  • Nicolas Meuleau
  • Ronen I. Brafman
Abstract

Many MDPs exhibit a hierarchical structure where the agent needs to perform various subtasks that are coupled only by a small subset of variables containing, notably, shared resources. Previous work has shown how this hierarchical structure can be exploited: instead of solving one huge MDP representing the whole problem, one solves several sub-MDPs representing the different subtasks in different calling contexts, together with a root MDP responsible for sequencing and synchronizing the subtasks. Another important idea, used by efficient algorithms for solving flat MDPs such as (L)AO* and (L)RTDP, is to exploit reachability information and an admissible heuristic in order to accelerate the search by pruning states that cannot be reached from a given starting state under an optimal policy. In this paper, we combine both ideas and develop a variant of the AO* algorithm for performing forward heuristic search in hierarchical models. This algorithm shows great performance improvements over hierarchical approaches using standard MDP solvers such as Value Iteration, as well as over AO* applied to a flat representation of the problem. Moreover, it presents a new general method for accelerating AO* and other forward search algorithms: substantial performance gains may be obtained in these algorithms by partitioning the set of search nodes and solving a subset of nodes completely before propagating the results to other subsets.

In many decision-theoretic planning problems, the agent needs to perform various subtasks that are coupled only by a small subset of variables. A good example of this is our main application domain: planetary exploration. In this domain, the agent, an autonomous rover, must gather scientific data and perform experiments at different locations.
The information-gathering and experiment-running task at each site is largely self-contained and independent of the other sites, except for two issues: the use of shared resources (such as time, energy and memory), and the state of some instruments that may be used at different locations (for instance, warmed-up or not). The rover has only limited resources for each plan execution phase and it must wisely allocate these resources to different tasks. In most problem instances, the set of tasks is not achievable jointly, and the agent must dynamically select a subset of achievable goals as a function of the outcomes of uncertain events. These problems are often called oversubscribed planning problems. They have a natural two-level hierarchical structure [Meuleau et al., 2006]. At the lower level, we have the tasks of conducting experiments at each site. At the higher level, we have the task of selecting, sequencing and coordinating the subtasks (in the planetary rover domain, this includes the actions of tracking targets, navigating between locations, and warming up instruments). This hierarchical structure is often obvious from the problem description, but it can also be recognized automatically using factoring methods such as that of [Amir and Englehardt, 2003]. The potential benefit of hierarchical decomposition is clear: we might be able to solve a large problem by solving a number of smaller sub-problems, potentially gaining an exponential speed-up. Unfortunately, the existence of a hierarchical decomposition does not imply the existence of an optimal decomposable policy. Thus, one must settle for certain restricted forms of optimality, such as hierarchical optimality [Andre and Russell, 2002], or consider domains in which compact optimal policies exist, such as those that satisfy the reset assumption of [Meuleau et al., 2006].
This paper formulates a hierarchical forward heuristic search algorithm for And-Or search spaces, which we refer to as Hierarchical-AO* (HiAO*). This algorithm exploits a hierarchical partition of the domain to speed up standard AO* search [Nilsson, 1980]. It may be applied anywhere that AO* may be applied, that is, to all problems that can be represented as an acyclic And-Or graph, provided a hierarchical partition of the search space is defined. This could be MDPs or any other search problem. Although there is no formal guarantee that it will result in actual performance improvements, our simulations with the rover domain show that it has great potential for oversubscribed planning problems.

Further work: Like standard AO*, HiAO* is not designed to work on problems that contain loops. An extension similar to that in [Hansen and Zilberstein, 2001] is required to handle these problems.
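For readers unfamiliar with the baseline, the following is a minimal sketch of standard AO*-style search on an acyclic And-Or graph, written as expected-cost minimization over a small acyclic MDP. It is not the HiAO* algorithm of the paper and does not use any hierarchical partition. The function name `ao_star`, the `actions` dictionary layout, the toy rover-like states, and the zero heuristic are all assumptions made for illustration.

```python
# Illustrative AO* sketch on an acyclic And-Or graph (plain AO*, not HiAO*).
# Assumed layout: actions[s][a] = (immediate_cost, [(probability, successor), ...]).
# A state with no actions is treated as terminal with cost 0.

def ao_star(start, actions, h):
    """Return (expected cost of start, greedy action for each expanded state)."""
    value = {start: h(start) if actions.get(start) else 0.0}
    policy = {}
    expanded = set()

    def q(s, a):
        cost, outcomes = actions[s][a]
        return cost + sum(p * value[t] for p, t in outcomes)

    def find_tip():
        # Follow the current greedy actions from the start state and return
        # a non-terminal state that has not been expanded yet, if any.
        stack, seen = [start], set()
        while stack:
            s = stack.pop()
            if s in seen or not actions.get(s):
                continue
            seen.add(s)
            if s not in expanded:
                return s
            _, outcomes = actions[s][policy[s]]
            stack.extend(t for _, t in outcomes)
        return None

    while (tip := find_tip()) is not None:
        expanded.add(tip)
        # New successors start at their heuristic estimate (0 if terminal).
        for _, outcomes in actions[tip].values():
            for _, t in outcomes:
                value.setdefault(t, h(t) if actions.get(t) else 0.0)
        # Cost revision: repeat Bellman backups over expanded states until
        # nothing changes. This terminates because the graph is acyclic.
        changed = True
        while changed:
            changed = False
            for s in expanded:
                best = min(actions[s], key=lambda a: q(s, a))
                new = q(s, best)
                if new != value[s] or policy.get(s) != best:
                    value[s], policy[s], changed = new, best, True
    return value[start], policy


# Hypothetical toy problem: choose one of three sites to sample.
actions = {
    "start": {
        "go_A": (1.0, [(1.0, "A")]),
        "go_B": (2.0, [(1.0, "B")]),
        "go_C": (5.0, [(1.0, "C")]),
    },
    "A": {"sample": (1.0, [(0.5, "goal"), (0.5, "A_retry")])},
    "A_retry": {"sample_again": (4.0, [(1.0, "goal")])},
    "B": {"sample": (1.5, [(1.0, "goal")])},
    "C": {"sample": (1.0, [(1.0, "goal")])},
    "goal": {},
}

cost, policy = ao_star("start", actions, lambda s: 0.0)
print(cost, policy["start"])
```

In this toy instance the search never expands state `C`, because the cost of reaching it already exceeds the best alternative; this is the kind of heuristic pruning the abstract refers to. The returned `policy` also contains entries for states that were expanded during search but are not reachable under the final greedy policy.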

Similar resources

A New Local-Search Algorithm for Forward-Chaining Planning

Forward-chaining heuristic search is a well-established and popular paradigm for domain-independent planning. Its effectiveness relies on the heuristic information provided by a state evaluator, and the search algorithm used with this in order to solve the problem. This paper presents a new stochastic local-search algorithm for forward-chaining planning. The algorithm is used as the basis of a ...

Generating Admissible Heuristics by Abstraction for Search in Stochastic Domains

Search in abstract spaces has been shown to produce useful admissible heuristic estimates in deterministic domains. We show in this paper how to generalize these results to search in stochastic domains. Solving stochastic optimization problems is significantly harder than solving their deterministic counterparts. Designing admissible heuristics for stochastic domains is also much harder. Theref...

Faster Optimal and Suboptimal Hierarchical Search

By Michael Leighton, University of New Hampshire, May 2012. In problem domains for which an informed admissible heuristic function is not available, one attractive approach is hierarchical search. Hierarchical search uses search in an abstracted version of the problem to dynamically generate heuristic values. This thesis makes three contributions...

Heuristics and metaheuristics in forward-chaining planning

Forward-chaining heuristic search is a well-established and popular paradigm for planning. It is, however, characterised by two key weaknesses. First, search is guided by a domain-independent heuristic which, although applicable in a wide range of domains, can often give poor guidance. Second, the metaheuristics used to control forward-chaining planning are often weak, using simple local-search ...

An Empirical Analysis of Local Search in Stochastic Optimization for Planner Strategy Selection

Optimization of expected values in a stochastic domain is common in real-world applications. However, it is often difficult to solve such optimization problems without significant knowledge about the surface defined by the stochastic function. In this paper we examine local search techniques to solve stochastic optimization. In particular, we analyze assumptions of smoothness upon which these...


Publication date: 2007